arXiv:1507.07280vl [stat.ME] 27 Jul 2015 


Submitted to the Annals of Statistics 


EFFICIENT CALIBRATION FOR IMPERFECT 
COMPUTER MODELS 

By Rui Tuo*’^ and C. F. Jeff Wu^’§ 

Chinese Academy of Sciences ^ and Georgia Institute of Technology ^ 

Many computer models contain unknown parameters which need 
to be estimated using physical observations. Tuo and Wu (2014) shows 
that the calibration method based on Gaussian process models pro¬ 
posed by Kennedy and O’Hagan (2001) may lead to unreasonable 
estimate for imperfect computer models. In this work, we extend 
their study to calibration problems with stochastic physical data. 

We propose a novel method, called the L 2 calibration, and show its 
semiparametric efficiency. The conventional method of the ordinary 
least squares is also studied. Theoretical analysis shows that it is con¬ 
sistent but not efficient. Numerical examples show that the proposed 
method outperforms the existing ones. 


1. Introduction. Computer simulations are widely used by researchers 
and engineers to understand, predict, or control complex systems. Many 
physical phenomena and processes can be modeled with mathematical tools, 
like partial differential equations. These mathematical models are solved by 
numerical algorithms like the finite element method. For example, computer 
simulations can help predict the trajectory of a storm. In engineering, com¬ 
puter simulations have become more popular and sometimes indispensable 
in product and process designs. The design and analysis of experiments is a 
classic area of statistics. A new field has emerged, which considers the design 
and analysis for experiments in computer simulations, commonly referred 
to as “computer experiments”. Unlike the physical experiments, computer 
experiments are usually deterministic. In addition, the input variables for 
computer experiments usually take real values, not discrete levels as in many 
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physical experiments. Therefore, interpolation methods are widely used in 
computer experiments, while conventional methods, like ANOVA or regres¬ 
sion models, are used much less often. 

In many computer experiments, some of the input parameters represent 
certain inherent attributes of the physical system. The true values of these 
variables are unknown because there may not be enough knowledge about 
the physical systems. For instance, in underground water simulations, the 
soil permeability is an important input parameter, but its true value is usu¬ 
ally unknown. A standard approach to identify the unknown model parame¬ 
ters is known as calibration. To calibrate the unknown parameters, one need 
to run the computer model under different model parameters, and run some 
physical experiments. The basic idea of calibration is to find the combina¬ 
tion of the model parameters, under which the computer outputs match the 
physical responses. 

One important topic in the calibration of computer models is to tackle 
the model uncertainty. Most physical models are built under certain as¬ 
sumptions or simplifications, which may not hold in reality. As a result, the 
computer output can rarely fit the physical response perfectly, even if the 
true values of the calibration parameters are known. We call such computer 
models imperfect. Kennedy and O’Hagan (2001) first discusses this model 
uncertainty problem and proposes a Bayesian method, which models the dis¬ 
crepancy between the physical process and the computer output as a Gaus¬ 
sian process. Because of the importance of calibration for computer models, 
the Kennedy-O’Hagan’s approach has been widely used, including hydrol¬ 
ogy, radiological protection, cylinder implosion, spot welding, micro-cutting, 
climate prediction and cosmology. See Higdon et al. (2004, 2008, 2013), 
Bayarri et al. (2007a,b), Joseph and Melkote (2009), Wang, Chen and Tsui 
(2009), Han, Santner and Rawlinson (2009), Goldstein and Rougier (2004) 
and Murphy et al. (2007), Goh et al. (2013). 

Tuo and Wu (2014) studies the asymptotic properties of the calibration 
parameter estimators given by Kennedy and O’Hagan (2001) and shows that 
their method can lead to unreasonable estimate. The first theoretical frame¬ 
work for calibration problems of the Kennedy-O’Hagan type is established 
by Tuo and Wu (2014) under the assumption that the physical responses 
have no random error. This assumption is needed to make the mathemati¬ 
cal analysis for a version of the Kennedy-O’Hagan’s approach feasible. Given 
the fact that the responses in physical experiments are rarely deterministic, 
it is necessary to extend the study to cases where the physical responses 
have measurement or observational errors. For convenience, we use the term 
“stochastic physical experiments” to denote physical responses with random 
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errors. 

In Tuo and Wu (2014), the theory of native spaces is used to derive the 
convergence rate for calibration with deterministic physical systems. Be¬ 
cause of the random error in the current context, the interpolation theory 
fails to work. In this work we will mainly use mathematical tools of weak 
convergence, including the limiting theory of empirical processes. 

The main theme of this article is to propose a general framework for cal¬ 
ibration and provide an efficient estimator for the calibration parameter. 
We utilize a nonparametric regression method to model the physical out¬ 
puts. Similar models are also considered in the literature of response surface 
methodology. See Myers (1999) and Anderson-Cook and Prewitt (2005). To 
estimate the calibration parameter, we extend the L 2 calibration method 
proposed by Tuo and Wu (2014) to the present context. This novel method 
is proven to be semiparametric efficient when the measurement error follows 
a normal distribution. A conventional method, namely, the ordinary least 
squares method, is also studied, and shown to be consistent but not efficient. 

This paper is organized as follows. In Section 2, we extend the L 2 projec¬ 
tion dehned by Tuo and Wu (2014) to stochastic systems and propose the 
L 2 calibration method in the current context. In Section 3, the asymptotic 
behavior for L 2 calibration is studied. In Section 4, we consider the ordinary 
least squares method. The proposed method is illustrated and its perfor¬ 
mance studied in two numerical examples in Section 5. Concluding remarks 
are given in Section 6. 

2. L 2 Projection for Systems with Stochastic Physical Experi¬ 
ments. Let Q denote the region of interest for the control variables, which 
is a convex and compact subset of R'^. Let xi,... ,Xn be a set of points 
on n. Suppose the physical experiment is conducted once on each Xi, with 
the corresponding response denoted by yf, for i = 1,... ,n, where the super¬ 
script p stands for “physical”. In this work, we assume the physical system is 
stochastic, that is, the physical responses have random measurement or ob¬ 
servational errors. To incorporate this randomness, we consider the following 
nonparametric model: 

( 2 . 1 ) m = C(xi) -hei, 

where ((■) is an unknown deterministic function and is a sequence 

of independent and identically distributed random variables with Eci = 0 
and Eef = < -|-oo. This model is also adopted by Kennedy and O’Hagan 

(2001), where ((■) is called the true process. In addition, Kennedy and O’Hagan 
assumes that e^’s follow a normal distribution. Such a distribution assump¬ 
tion will be slightly relaxed in our theoretical analysis. 
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Let 0 be the parameter space for the calibration parameter 0. Suppose 
0 is a compact region in R"?. Denote the output of the deterministic com¬ 
puter code at (a;,0) G D x 0 by y^{x,6), where the superscript s stands 
for “simulation”. In the frequentist framework of calibration established by 
Tuo and Wu (2014), the concept of L 2 projection plays a central role. Be¬ 
cause the “true” calibration parameter (as stated in Kennedy and O’Hagan 
(2001)) is unidentifiable, Tuo and Wu (2014) defines the purpose of calibra¬ 
tion as that of finding the L 2 projection 9* which minimizes the L 2 distance 
between the physical response surface and the computer outputs as a func¬ 
tion of the control variables. In the present context, the physical responses 
are observed with errors. A good definition of the “true” value of 6 should 
exclude the uncertainty in y^. Thus we suggest the following definition for 
the L 2 projection using the true process C(')' 

(2.2) 9* :=argmin||C(-)-!/"(•, 0 )||L 2 (f^)- 

eee 

The main focus of this article is on the statistical inference for 9*. 

2.1. L 2 Calibration. In this section, we will extend the L 2 calibration 
method proposed by Tuo and Wu (2014) to the present context. Since Tuo and Wu 
assumes that the physical experiment is deterministic, they use the kernel 
interpolation method to approximate the physical response surface. Because 
of the existence of the random error in (2.1), the kernel interpolation can per¬ 
form poorly because interpolation methods generally suffer from the problem 
of overhtting. 

In spatial statistics, the effect of the random error is usually modeled 
with a white noise process, which is also referred to as a nugget term in the 
kriging modeling (Cressie, 1993). In the computer experiment literature, it is 
also common to use the nugget term in Gaussian process modeling to tackle 
the numerical instability problems (Gramacy and Lee, 2010; Peng and Wu, 

2014). 

Let z{-) be a Gaussian process with mean zero and covariance function 
‘k(., •). Suppose {{xi,yi)}\^i are obtained, which satisfy yi = z{xi) -\-ei with 
Cj’s being i.i.d. and distributed as A^(0, cr^). Then the predictive mean of z{-) 
is given by 

n 

(2.3) z{x) = y^^Ui^{xi,x), 

i=l 

where u = (ui,..., Un)"^ is the solution to the linear system 

(2.4) y = ($ + a‘^l)u, 
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with Y = {yi,... ,yn)'^ and $ = {^{xi,Xj))ij. By the representer Theorem 
(Wahba, 1990; Scholkopf, Herbrich and Smola, 2001), z{x) given by (2.3) 
and (2.4) is the solution of the following minimization problem with some 
A > 0: 

1 ” 

(2.5) argmin - ^(y^ - /(xi))^ + A||/||^ 

where || • ||A/'„.(n) is the norm of the reproducing kernel Hilbert space A/$(n) 
generated by the kernel function <^. We refer to Wendland (2005) and Wahba 
(1990) for detailed discussions about these spaces. The solution to (2.5) is 
referred to as the nonparametric regressor in the reproducing kernel Hilbert 
space (Berlinet and Thomas-Agnan, 2004). 

Now we are ready to define the L 2 calibration method for systems with 
stochastic physical experiments. Suppose the physical experiment is con¬ 
ducted over a design set {xi ,... ,Xn}- Dehne 

1 "" 

(2.6) C := argmin - ^(yf - fixi)f -L A||/||^ 

where the smoothing parameter A can be chosen using certain model selec¬ 
tion criterion, e.g., generalized cross validation (GCV). See Wahba (1990). 
We define the L 2 calibration for 9 as 

0^2 := argmin IIC(-) - f{-,9)\\L2in), 
eee 

where y^(-, •) is an emulator for the computer code y^(-, •). In this work, the 
emulator for the computer model can be constructed by any method pro¬ 
vided that it approximates y® well. For instance, y^ can be constructed by 
the radial basis function approximation (Wendland, 2005), Gaussian process 
models (Santner, Williams and Notz, 2003) or the polynomial chaos approx¬ 
imation (Xiu, 2010). 

3. Asymptotic Results for L 2 Calibration. We now consider the 
asymptotic behavior of 9^^ as the sample size n becomes large. For mathe¬ 
matical rigor, we write 

1 "" 

(3.1) C„ = argmin - ^(yf - fixi)f + An||/||^ 

for all sufficiently large n, where {A„}(h^ is a prespecified sequence of positive 
values. For the ease of mathematical treatment, we assume the Xj’s are a 
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sequence of random samples rather than fixed design points. We also write 
the L 2 calibration estimator indexed by n as 

(3-2) := avgmmWCni-) - yi{-,0)\\L2(n), 

eee 

where the emulator y* is also indexed by n. We assume that y* has increasing 
approximation power as n becomes large. 

3.1. Asymptotic Results for Cn- Before stating the asymptotic results 
for 0 ^ 2 ^ we need first to show that Cn tend to C- To study the convergence, 
we need some additional definitions from the theory of empirical processes 
(Kosorok, 2008). For function space T over fl, define the covering number 
N{6,iF, II • ||L^(r 2 )) as the smallest value of N for which there exist func¬ 
tions /i,..., /at, such that for each / G J", ||/ - fj\\L^{n) < ^ for some j G 
{1,..., A^}. The L 2 covering number with bracketing || • 11 ^ 2 ( 0 )) is 

the smallest value of N for which there exist L 2 functions 
with Wfj^ — fj"\\L 2 {n) ^ <5) J = 1, ■ ■ ■, such that for each f ^ T there exists 
a f such that < f < f^. 

We now state a result for general nonparametric regression. Suppose 
is a space of functions over a compact region 12 equipped with a norm || • ||. 
Suppose the true model is 

(3.3) Vi = fo{xi) + Ci, 

and Xi are i.i.d. from the uniform distribution 17(12) over 12. In addition, the 
sequences {xi} and {cj} are independent and has zero mean. We use 
to denote that the left side is dominated by the right side up to a constant. 
Let 

1 " 

(3.4) fn = argmin - ^(y* - /(xj))^ -h An||/f, 

/e.F 

for some A„ > 0. 

Lemma 1. Under the model (3.3), suppose /o G Let F{p) := {/ G 
-T : ll/ll < p}. Suppose there exists Cq > 0 such that Ll[exp(C'o|ej|)] < 00 . 
Moreover, there exists 0 < r < 2 such that 

logA^[](<5, J’(y), II • IIljC^)) ^ 

for all 6,p > 0. Then if the estimator /„ given by (3-4) 

satisfies 

WfnW = Op{l), and Wfn - /o||i, 2 (n) = Op{Xl/'^). 
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Proof. See van de Geer (2000). □ 

The covering numbers for some reproducing kernel Hilbert spaces have 
been calculated accurately in the literature. For instance, consider a Mat&n 
kernel function given by 

(3.5) - t\\y {2^/l^(j)\\s - t\\) , 

with > 1 (Stein, 1999; Santner, Williams and Notz, 2003). The reproduc¬ 
ing kernel Hilbert space generated by this kernel function is equal to the 
(fractional) Sobolev space F7^+'^/^(H), and || • ||a/'.^(q) and || • are 

equivalent. See Corollary 1 of Tuo and Wu (2014). Let := {/ : 

WfWni^in) < p}- Edmunds and Triebel (2008) proves that for p > d/2, the 
covering number of is bounded by 

logiV(5,H^^(L!,p),||.||i^(o))<(^)"^^ 

where C is independent of p and 6. To calculate the L 2 metric entropy with 
bracketing, we note the fact that every /,/' E H^{^l,p) with ||/ — f'\\ < 5 
satisfy the inequality f — S < f < f + S. Thus the union of the 5-balls 
centered at /i is covered by the union of the “brackets” [fi — 6, fi + 

5], ...,[/„ — 5, /n -F 5], which, together with the definition of the covering 
number and the L 2 covering number with bracketing, implies that 

(3.6) logNy26yV^),H^{n,p),\\ • ||i,(n)) < 

where Vol{id) denotes the volume of H, and 25y^Ho/(H) is the L 2 {^) norm 
of the function 25. Then by applying Lemma 1, the following result can be 
obtained after direct calculations. 

Proposition 1. Under the model (3.3), suppose fo £ T = AF<j,(H) and 
A/$(H) can be embedded into H^(H) with p > d/2. Choose || • || = || • ||A/'#(r 2 )- 
Then for the estimator fn given by (3.4) satisfies 

\\fn\\Af.f,{n) = Op{l), and ||/„ - /o||i, 2 (n) = Op{\j/^). 

In Proposition 1, one can choose A x ? 7 ,“ 2 Ai/( 2 /i-i-d) obtain the best 
convergence rate ||/n — /o||l 2 (o) = where “x” denotes that 

its left and the right sides have the same order of magnitude. This rate is 
known to be optimal (Stone, 1982). 
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3.2. Asymptotic Normality. The main purpose of calibration is to es¬ 
timate the calibration parameter 6*. In this section, we will prove some 
convergence properties of the L 2 calibration: its convergence rate is given by 
ll^n^ — 0*11 = and the distribution of y/n{6^^ — 9*) tends to nor¬ 

mal as n —)• 00 under certain conditions. This is a nontrivial result because 
the convergence rate for the nonparametric part ||Cn(') — CllL 2 (t 2 ) generally 
slower than (see Proposition 1). 

We first list necessary conditions for the convergence result, which are 
grouped in three categories. 

The first group consists of regularity conditions on the model. For any 
0 € 0 C R'^, write 0 = (0i,..., 6q). 


Al: 


A2 

A3 

A4 

A5 


The sequences {xi} and {cj} are independent; Xj’s are i.i.d. from f7(n); 
and {cj} is a sequence of i.i.d. random variables with zero mean and 
finite variance. 

0 * is the unique solution to ( 2 . 2 ), and is an interior point of 0 . 


SUp0g0 

V :=E 


0 ^(-, 61 ) 11 ^ 2 ( 0 ) <+ 00 . 


is invertible. 


There exists a neighborhood U C 0 of 0*, such that 


sup 

e&u 


9) 

90 /’ ^ 


mn; ^ BOiBO, 


{v)£C(f!xf/), 


for all 9 G U and j,k = 1,... ,q. 


Next we need some conditions on the nonparametric part. 

Bl; C G J\f^{Q) and A/$(fI,/?) is Donsker for all p > 0. 

B2: lie - Clli,2(f^) = Op{l). 

B3: ||C||a/-,(o) = Op{l). 

B4: An = Op(n“^/^). 

The Donsker property is an important concept in the theory of empirical 
processes. For its definition and detailed discussion, we refer to van der Vaart and Wellner 
(1996) and Kosorok (2008). One major result is that a class of functions over 
domain D, denoted as E, is Donsker, if 


log N[]{6, E, L 2 {0.))d6 < - 1 - 00 . 

Thus from (3.6) we can see that if A/#(D) can be embedded into for 

some pL > d/ 2 , A/'<i>(D) is Donsker. Actually if we further assume condition Al 
and Fl[exp(C'|ej|)] < -|-oo for some C > 0, the conditions of Proposition 1 are 
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satisfied. Then by choosing a suitable sequence of {A„}, say A x , 

one can show that condition B4 holds and condition B2 and B3 are ensured 
by Proposition 1. 

Finally we need to assume some convergence properties for the emulator. 
In this work, we assume that the approximation error caused by emulat¬ 
ing the computer experiment is negligible compared to the estimation error 
caused by the measurement error in the physical experiment. Under this as¬ 
sumption, the asymptotic behavior of 9^'^ — 6* is determined by the central 
limit theorem. Given that computer experiment is usually much cheaper to 
run than physical experiment, such an assumption is reasonable because the 
size of computer runs is in general much larger than the size of physical 
trials. 


Cl: 

C2: 


Vn - y""h^inxe) = Op{n 


dy‘ 

Ml 


y Ill,oo(nx0) - ^p\ 

= Op(n“^/^), ioT i = 1,... ,q. 


dy” 

Ml 


Loo(f^X©) 

Now we are ready to state the main theorem of this section on the asymp¬ 
totic normality of the L 2 calibration. 


Theorem 1. Under conditions A1-A5, BI-B 4 , and C1-C2, we have 
(3.7) §t-e* = -2U-1 |i^e,^(xi,r)| +Op(n-V2), 

where V is defined in condition A4- 

Proof. We first prove that 9n^9*. From the definitions of 9* and 9(^^ 
in (2.2) and (3.2), it suffices to prove that ||Cn(‘) “yn(') ^)llL 2 (f^) converges to 
IIC(') “ y^(') ^)llL 2 (f 2 ) uniformly with respect to 0 G 0 in probability, which 
is ensured by 

ii4(-)-r(-,0)iiL(o)-iic(-)-i/^(-,^)iiL(o) 

= I (Cn{z)-Ciz)-y%z,9) + y%z,9)J (Cniz) + Ciz)-y%z,9)-y%z,9)J 
J 

< (llCn - CllL2(n) + lly^(')^) “ y^(')^)llL2(n)) ■ 

^^<^X-)IIt 2(!^) + IIC(-)IU2(f^) + lly^(')^)llL2(n) + lly*(-)^)llL2(!^)) > 

where the inequality follows from the Schwarz inequality and the triangle 
inequality. Denote the volume of D by Vol{Vl). It is easily seen that 
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holds for all / S Loo{^)- Thus 

(3-9) < \/^o/(n)||y^ - y^||Loo(r2x0)- 

Additionally, we have 

IICn||L2(n) < ^o/(f^)||Cn||L^(n) = l^o/(0) SUp (Cn, ^, 3;)) 
(3.10)< l^o/(i7)||Cn||AA(fl)SUp||^>(-,x)||Ars(0) = '1^0^(^^)IICn||AA(fl)- 


Combining (3.9), (3.10), B2, and Cl, we have that (3.8) convergence to 0 
uniformly with respect to 0 G 0, which yields the consistency of 
Since 0 minimizes (3.2), following Al, A2 and A5 we have 



which, together with B2, Cl and C2, implies 

(3-11) (Cn{z) - y^{z, ^t)dz = Op(n-^/2)_ 


Let Lnif) = n ^YA=i{Vi “ f{xi)f + An||/||^^(s^)- By (2.6), C„ minimizes 
Ln over A/',i)(fl). Since 0 ^^ ig consistent, by A5, f|-(-,^,^^) G for 

j = and sufficiently large n. Thus we have 


0 = 


(3.12) 






i=l 3 i—i 3 

‘^(Cn + Dn + En). 


First, we consider Cn- Let Ai{g,6) = {g{xi) — C{xi)}^^{xi,9) for {g,0) G 
A/’$(n,p) X U with some p > 0 to be specihed later. Then E[Ai{g,9)] = 
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L{9{z) - C{z)]^{z , 6 )dz. Define the empirical process 


9) = n-V2 9) - E[Mg, 0)]}. 

i=l 


By Bl, is Donsker. Thus = {g — C ■ 9 ^ is also 

Donsker. Condition A5 implies that J -2 = {§!“(•) ^) : ^ C 17} is Donsker. 
Since both Ti and T 2 are uniformly bounded, the product class Ju x E 2 is 
also Donsker. For theorems on Donsker classes, we refer to Kosorok (2008) 
and the references therein. Thus the asymptotic equicontinuity property 
holds, which suggests that (see Theorem 2.4 of Mammen and van de Geer 
(1997)) for any ^ > 0 there exists a <5 > 0 such that 


lim sup P 

n —>00 


sup 

\/GJ-iXJF2,||/||<<5 


1 " 

~ ^(/(^*))) 

V ^ • 1 

t=l 



<e, 


where || • || is defined as ||/|p := E[f(xi)]‘^. This implies that for all ^ > 0 
there exists a 5 > 0 such that 


(3.13) limsupPj sup |F;i„( 5 ,6»)| > .^ ] < ^. 

n^oo \geAA(n,p),0gC,||g-CIU2(n)<5 / 

Now fix e > 0. Condition B3 implies that there exists po > 0, such that 
-f’(llCn||A/'*(o) > Po) ^ Choose Jq to be a possible value of 5 satisfying 
(3.13) with p = Po and ^ = s/3. Dehne 


Sn • 



if llCn|k«(n) < Po and ||Cn - Clliaio) < <^0 
elsewise 


Therefore, for sufficiently large n we have 
P{\Ein{L9t)\>e) 

< P(|^ln(C, 0>)| > e) + P(||Cn|k,(II) > Po) + T’dICn " ClU^fO) > ^o) 

< P(|^in(C, 9/:^)\ > e/3) + e/3 + e/3 

< p{ sup \Ein{g,6)\> e/3\ +s/3 + s/3 

\gGA^(r2,po),6»eC/,||g-C||i2(n)<5o / 

< e, 


where the first and the third inequalities follows from the definition of Cn', 
the second inequality follows from B2; the last inequality follows from (3.13). 
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This implies that Ein{Cn,0n^) tends to zero in probability. Thus we have 


Op(l) = EiniLJn^) = n 


2 = 1 


de, 




= /" - C{z)}^{z,e^^)dz, 

J Q 

which implies 

(3.14) Cn = j^{(n{z) - Ciz)}^{zJ^^)dz + Op{n-^/‘^). 

By substituting (3.11) to (3.14) and using A2, we can apply the Taylor 
expansion to (3.14) at 6* and obtain 

Cn = [ {y"{z,9j^^) - C{z)}^{z,e^^)dz + Op{n-C‘^) 


in 


ddi 


y%z,en)-Ciz)ydz\{e^^-e*) 


1 r 

2j^dWMj 

(3.15) +Op(n"^/^), 

where 9n lies between 9n and 9*. By the consistency of we have 9n A 9* 
This implies that 

r 

Jn 99^09 


y‘'{z,9n) - C{z)) dz 


(3.16) 


/ 

Jn 




{y%z,9*)-C{z))^dz = V. 


89^89 

Now we consider Dn- Define the empirical process 
E2n{9) 


= n- 1 / 2 ' 


2=1 

n 


8y 


8y^ 




2=1 


89 

8y^ 


89, 

8y^ 




= n j ei^(xi,0 )-ei^(xi,r) 


where 9 & U. By A5, the set {fe G C(R x Q) : f 0 {e,x) = e^-{x,9) — 
e^-{x,9*),9 G [/} is a Donsker class. This ensures that E 2 n{-) weakly con¬ 
verges in Loo{U) to a tight Guassian process, denoted by G(-). Without loss 
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of generality, we assume that G'(-) has continuous sample paths. We note 
that G{6*) = 0 because E 2 n{G*) for all n. Then as a consequence of the con¬ 
sistency of and the continuous mapping theorem (van der Vaart, 2000), 
E 2 n{Gn^) G{9*) = 0, which gives 


(3.17) Dn = ^'^ei^{xi,6*) + Op{n fo^). 

i=l ^ 

Finally we estimate CA. By A5, B2 and B3, By C2, C3 and C4, 


(3.18) 


i^n < A„||C|k,(Q) = Op(n-fo2) 


AT (II) 


By combining (3.12), (3.15), (3.16), (3.17) and (3.18), we prove the desired 
result. □ 


Theorem 1 implies the asymptotic normality of y/n{9^^ 
that 




(3.19) W :=E 

is positive dehnite. Specihcally, 

(3.20) Vn{0n^ - 0*) A iV(0, Aa'^y-^WV-^). 


9*), provided 


3.3. Semiparametric Efficiency. In this section, we discuss the efficiency 
of the proposed L 2 calibration. It will be shown that, as a semiparametric 
method, the L 2 calibration method reaches the highest possible efficiency if 
the measurement errors follow a normal distribution. 

In statistics, a parametric model is one whose parameter space is hnite di¬ 
mensional, while a nonparametric model is one with an infinite dimensional 
parameter space. The definition of semiparametric models is, nevertheless, 
more complicated. Refer to Groeneboom and Wellner (1992); Bickel et al. 
(1993) for details. In simple terms, a semiparametric problem has an infinite 
dimensional parameter space but the parameter of interest in this problem 
is only finite dimensional. The calibration problems under consideration are 
semiparametric. To see this, consider the calibration model given by (2.1) 
and (2.2). The parameter space of model (2.1) contains an infinite dimen¬ 
sional function space which covers C- On the other hand, the parameter of 
interest is 9* in (2.2), which is g-dimensional. 

Now we briefly review the estimation efficiency in semiparametric prob¬ 
lems. For details, we refer to Bickel et al. (1993); Kosorok (2008). Let H be 
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an infinite dimensional parameter space whose true value is denoted by ^o- 
Denote the feature of interest as i'{Co) with a known map i/ : H i—> R'^. 
Suppose Tn is an estimator for i^{Co) based on n independent samples and 
that y/n[Tn — z^(^o)) is asymptotically normal. Now let Hq be an arbitrary 
hnite dimensional subset of H satisfying £ ^o- We consider the statisti¬ 
cal estimation problem with the same observed data but with the parame¬ 
ter space Hq. Under this parametric assumption and some other regularity 
conditions, an efficient estimator can be obtained by using the maximum 
likelihood (ML) method, denoted by Since the construction of uses 
more assumptions than T„, the asymptotic variance of should be less 
than or equal to that of T^. We call T„ semiparametric efficient if there 
exists a Hq such that has the same asymptotic variance as T„. 

For the calibration problem given by (2.1) and (2.2), consider the following 
q'-dimensional parametric model indexed by 7 : 

(3.21) «-)=C(-)+ 7 ’’^(-,n, 

with 7 S R'^. Then (2.1) and (3.21) form a linear regression model. Regard¬ 
ing ( 2 . 1 ), the true value of 7 is 70 = 0. Suppose that in ( 2 . 1 ) follows 
A^(0, cj^) with an unknown < 7 ^. Under the regularity conditions of Theorem 
1, the ML estimator for observations is the least squares esti¬ 

mator, with the asymptotic expression 

(3.22) % = + Op(n"^/^), 

n 00 

i=l 

where W is dehned in (3.19). Then a natural estimator for 9* in (2.2) is 

(3.23) On = argmin||C 7 ^(-) - y\-,9)\\L2{n)- 

eee 

Again, we simplify the problem in (3.23) by assuming that is a known 
function. The asymptotic variance of 9n can be obtained by the delta method. 
As in Al, assume that xt follows the uniform distribution over D. Then 
ll/llLco) = for all /• Dehne 

(3.24) e{t) = argmmE[Ct{xi) -y%Xi,e)f, 

6»ee 

for each t near 0. Let 
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Then (3.24) implies = 0 for all t near 0. From the implicit function 

theorem, we have 



(3.25)= -2V-^W. 


By the delta method, 


(3.26) 0n-9* = mn) - m = 


which, together with (3.22) and (3.25), yields 



2 = 1 


Noting that the asymptotic expression of the L 2 calibration given by (3.7) 
has the same form as the ML estimator for the parametric model in (3.27), 
we obtain the following theorem. 

Theorem 2. Under the assumptions of Theorem 1, if ei in (2.1) fol¬ 
lows a normal distribution, then the L 2 calibration (3.2) is semiparametric 
efficient. 

Since the normal distribution is commonly used to model the random 
error in physical experiments (see e.g. Wu and Hamada, 2011) and the cal¬ 
ibration for computer experiments (see e.g. Kennedy and O’Hagan, 2001). 
Theorem 2 suggests that the proposed method is efficient for many practical 
problems. For non-normal error distributions, the ML estimator does not 
agree with the least squares estimator. Thus the ML estimator cannot be 
expressed by (3.22). Consequently, the L 2 calibration defined by (3.1) and 
(3.2) is not semiparametric efficient. However, if the random error is from a 
parametric model, the proposed L 2 calibration can be modified to achieve 
the semiparametric efficiency. Denote the likelihood function of by /(/3; Cj) 
with (3 € B, i.e, e* has a density l{/3o‘, •) for some unknown /3q G B. Suppose 
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L(-; x) := log Z(-; x) is convex for all x. Then the penalized ML estimator for 

C is 

1 " 

(3.28) := argmin - ^ L(/3; y* -/(x*)) + A||/||^ 

0ei3jeAf^{n) ^ 

Then define the modified L 2 calibration as 

(3.29) (9^^ ;= argmin IIC^^(-) -y^(-, 6 ')||i 2 (Q). 

eee 

By using similar arguments, it can be proved that, under some regularity 
conditions, 9^^ is semiparametric efficient. For a related discussion, we refer 
to Shen (1997). 

4. Ordinary Least Squares. In this section, we will study an al¬ 
ternative method, namely, the ordinary least squares (OLS) calibration. 
There are several versions of the OLS method discussed in statistics and 
applied mathematics for calibration problems and inverse problems (e.g., 
Joseph and Melkote, 2009; Evans and Stark, 2002). Here we consider a gen¬ 
eral form, which is apparently new but covers the existing versions. As 
before, let y® be a sequence of surrogate models for Define the OLS 
estimator for the calibration parameter as 

n 

(4.1) 9^^^ = argmin^ (yf - y^(xi, 9)f , 

where x^’s and y^’s are from the model ( 2 . 1 ). 

Obviously, the OLS calibration is a natural choice when there is no dif¬ 
ference between the true process and the optimal computer output, i.e., 
C(') = 2 /*(')^*)- However, we are particularly interested in the asymptotic 
behavior of the OLS calibration when ^(•) and y^{-,9*) are different. 

Analogous to Theorem 1 for the L 2 calibration, we have the following 
theorem on the asymptotic behavior of the OLS calibration. 

Theorem 3. In addition to conditions AI-A 4 and C1-C2, suppose that 
there exists a neighborhood U of 9*, such that y^(x,-) G C'^’^{U) for all 
X G D, where denotes the space of functions whose second derivatives 
are Lipschitz. Then 

qOLS _ 0* = y-1 |l ^ _ y^(a:„r))2| + Op(n-V2). 


imsart-aos ver. 2011/11/15 file: calibration_stochastic.tex date: July 28, 2015 


EFFICIENT CALIBRATION 


17 


Proof. First we prove 6^^^ ^ 9*. By condition A2, it suffices to show 
that 


(4.2)sup 

eee 




2=1 


Note that 

n 1 ^ 

ynixi, d)f --J2^yi- 

2=1 2=1 

1 "" 

= - '^{y^{xi,0) - y^{xi,e)){2yf - y^{xi,e) - y^{xi,e)) 

2=1 

/ n 

(4.3) < + 

V i=l 

Since 0 is compact, the uniform law of large numbers (van der Vaart and Wellner, 
1996) implies 


(4.4) sup 
eee 


n 


2=1 


and 


(4.5) sup 
6 »ee 


n 


E(2/f - y"{xi, 0)f - E[yl - y^Xi, 0)]' 


2=1 


■ 0 . 


Direct calculations give 

(4.6) E[yf-y^{xi,9)f = [ (Ciz) - y"{z,9))‘^dz + , 

Jn 

which, together with (4.3), (4.4), (4.5) and (4.6), proves (4.2). 

By the definition (4.1), condition A2 and the consistency of 9^^^, we have 



2=1 
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which, together with the law of large numbers and conditions Cl and C2, 
yields 



The remainder of the proof follows from some direct calculations using the 
standard asymptotic theory for Z-estimators (van der Vaart, 2000). □ 


Theorem 3 shows that the OLS calibration is consistent even if the com¬ 
puter code is imperfect. Compared with the L 2 calibration, the OLS calibra¬ 
tion is computationally more efficient. Also, the OLS calibration does not 
require tuning, while in the L 2 calibration the value of the tuning parameter 
A in (2.6) needs to be determined. However, according to Theorem 3, the 
asymptotic variance of the OLS calibration does not reach the semiparamet- 
ric lower bound given by (3.27). 

We now study the conditions under which the L 2 calibration and the 
OLS calibration are asymptotically equivalent. Let Si = dcr^VL. Then the 
asymptotic variance of the L 2 calibration given by (3.20) is . Let 


So = 


E 

AE 






{ei + C{xi) -y"{xi,9*)) -^{xi,9*)-^{xi,9*) 


(4.7) 


= Aa^W + AE 


3*112< 92 /%^ /,*i^2/^ 


{C{xi)-y^{xi,9*)y-^ixu9*)-^{xi,9* 


Then Theorem 3 shows that the asymptotic variance for the OLS calibration 
is V~^'S 2 V~^. From (4.7), it is seen that S 2 —Si > 0. Additionally, Si = S 2 
if and only if 


(4.8) E 


■,2dy" 


dy^ 


{C{xi) - yyx,,9*)Y^{xu9*)-^{x,,9*) 


d9 


= 0 . 


Suppose ^{x,9*) y 0 for all x G H. Then (4.8) holds only if C(x) = 
y®(x, 9*) for almost every a; G H, i.e., there exists a perfect computer model. 
In this case, the OLS calibration has the same asymptotic distribution as 
the L 2 calibration. However, as suggested by Kennedy and O’Hagan (2001), 
the bias between y^(-, 9*) and C(') can be large in practical situations. Thus 
in general the OLS calibration is less efficient than the L 2 calibration. 
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5. Numerical Studies. In this section, we compare the numerical be¬ 
haviors of three methods for the estimation of the calibration parameters: the 
L 2 calibration, the OLS calibration, and a version of the method proposed 
by Kennedy and O’Hagan (2001). The original version of the Kennedy- 
O’Hagan (abbreviated as KO) method is a Bayesian approach. In order to 
compare with the proposed frequentist methods, we consider the frequentist 
version of the KO method stated in Tuo and Wu (2014), where the maxi¬ 
mum likelihood estimation is used. 

5.1. Example 1: perfect computer model. Suppose the true process is 
(5.1) C(3^) = exp(x/10) sin X, 

for X G 0 = (0, 27r). The physical observations are given by 
(5-2) yf = C{xi) + ei, 

with 

(5.3) Xi = 27ri/50, Cj ~ N{0, a^), for z = 0,..., 50. 

We will consider two levels of cr^ (with = 0.1 and = 1) so that the 
numerical stability of the methods with different noise levels is investigated. 

Suppose the computer output is 

(5.4) y^{x,6) = C(t) — |0 -I- l|(sin0x -|- cos0x). 

Then we have f{-) = y^{-,—l). Thus 9* = —1. And there is no discrepancy 
between C(-) and y^{-,9*), i.e., the computer model is perfect. For simplicity, 
we suppose that (5.4) is a known function so that we do not need an emulator 
for it. 

We conducted 1000 random simulations to examine the performance of 
the L 2 calibration, the OLS calibration, and the KO calibration for = 
0.1 and = 1 respectively. For the L 2 calibration and the KO calibra¬ 
tion, the Gaussian correlation family $(xi,X 2 ) = exp{—0(xi — X 2 )^} is 
used with the model parameter (f chosen by the cross-validation method 
(Santner, Williams and Notz, 2003; Rasmussen and Williams, 2006). The 
tuning parameters in the nonparametric regression is selected by the gener¬ 
alized cross validation (Wahba, 1990). 

Table 1 shows the simulation results. The results for cr^ = 0.1 and cr^ = 1 
are given in columns 2-3 and 4-5 respectively. The true values of 9* are given 
in the second row. The last three rows give the mean value and the mean 
square error (MSE) over 1000 random simulations for the three methods. 
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Table 1 

Numerical comparison for perfect computer model. MSE = mean square error. 



= 0.1 


True Value 

-1 

-1 


Mean 

MSE 

Mean 

MSE 

L2 

-0.9990 

6.497 X 10“’" 

-0.8876 

0.0906 

OLS 

-0.9999 

1.160 X lO”'^ 

-0.9306 

0.0908 

KO 

-0.9993 

8.065 X lO”” 

-0.9325 

0.0468 


It can be seen from Table 1 that all three methods give good estima¬ 
tion results in this example. The good performance of the KO method is 
not surprising because the computer model here is perfect. In their theo¬ 
retical study on the KO method with deterministic physical experiments, 
Tuo and Wu (2014) obtained the limiting value of the KO method under 
certain conditions. Using Theorem 1 of Tuo and Wu (2014), it can be seen 
that, for deterministic physical experiments, the Kennedy-O’Hagan method 
would be consistent if the computer model is perfect. The simulation results 
in this example suggest that this statement may also hold for stochastic 
physical systems. 

5.2. Example 2: imperfect computer model. Now we consider an exam¬ 
ple with an imperfect computer model. Suppose the true process and the 
physical observations are the same as in Example 1, given by (5.1), (5.2), 
and (5.3). Suppose the computer model is 

(5.5) y\x, 9) = C(x)- -9 + 1 (sin 9x -t- cos 9x). 

As in Example 1, we suppose is known. Erom (5.5), it can be seen that 
there does not exist a real number 9 satisfying y’^{-,9) = C(')! because the 
quadratic function 9‘^ — 9 + 1 is always positive. Thus, this computer model 
is imperfect. 

The L 2 discrepancy between the computer model and the physical model 
has an explicit form: 

(5.6) IK - !/'(■,9)|li,,n, = (9= - 9 + 1) ( 2 ^ - ~ ^ 

with a continuous extension at 0 = 0. Eigure 1 plots the function (5.6) with 
—2 < 9 < 2. Numerical optimization shows that the minimizer of (5.6) is 
9* « -0.1789. 

As in Example 1, we conducted 1000 random simulations to compare the 
L 2 calibration, the OLS calibration, and the KO calibration. We keep the 
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theta 


Eig 1. 1/2 discrepancy function in Example 2. 


remaining setup of this experiment the same as in Example 1. The mean 
value and standard deviation (SD) over 1000 simulations are shown in Table 

2 . 


Table 2 

Numerical comparison for imperfect computer model. SD = standard deviation. 



= 0.1 

2 ^ 

CT = 1 

True Value 

-0.1789 

-0.1789 


Mean 

SD 

Mean 

SD 

1/2 

-0.1792 

2.665 X lO”'" 

-0.1773 

0.0711 

OLS 

-0.1770 

2.674 X 10“^ 

-0.1684 

0.1060 

KO 

-0.1224 

7.162 X 10"^ 

0.0034 

0.3244 


It can be seen from Table 2 that the L 2 calibration and the OLS calibra¬ 
tion outperform the KO calibration. Furthermore, the mean value of the KO 
estimator changes a lot as changes. This is undesirable because a good 
estimator should not be sensitive to random error for large samples. Table 2 
also shows that the standard deviation of the L 2 calibration is smaller than 
that of the OLS calibration. This agrees with our theoretical analysis, which 
shows that the L 2 calibration is more efficient than the OLS calibration for 
imperfect computer models. Overall, the KO calibration underperforms the 
L 2 calibration or the OLS calibration. 

6. Concluding Remarks and Further Discussions. In this work, 
we extend the framework established in Tuo and Wu (2014) to stochastic 
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physical systems. We propose a novel method, called the L 2 calibration, and 
prove its asymptotic normality and semiparametric efficiency. We also study 
the OLS method and prove that it is consistent but not efficient. Although 
the OLS calibration is computationally less costly, the L 2 calibration should 
be seriously considered because of its high estimation efficiency. By using 
a more efficient estimator, fewer physical trials are needed to achieve the 
same estimation efficiency. In most practical problems, physical experiments 
are more expensive to run. Therefore it would be worthwhile to save the 
physical runs by doing more computation. Thus we recommend using the 
L 2 calibration over the OLS calibration. 

Because of the identifiability problem in calibration, we define the purpose 
of calibration as that of finding the L 2 projection, i.e., the parameter value 
which minimizes the discrepancy between the true process and the computer 
output under the L 2 norm. Noting that the “true” value of the calibration 
parameter in our framework depends on the choice of the norm, one may 
also consider the asymptotic results for calibration under a different norm. 
After some calculations, it can be shown that the main results of this work 
still hold if the new norm is equivalent to the L 2 norm. However, if a norm 
that is not equivalent to the L 2 norm, such as the norm, is used, the 
idea in the proof of Theorem 1 will not work. We believe that, for those 
norms, there do not exist estimators with convergence rate This 

will require further work. 

We have reported the asymptotic properties of the L 2 calibration un¬ 
der the random design, i.e., Xi are sampled independently from the uni¬ 
form distribution. Given the fact that many physical experiments are con¬ 
ducted under hxed designs (see books by Box, Hunter and Hunter, 2005 and 
Wu and Hamada, 2011), the results for calibration for fixed designs need 
further investigation. 
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